Audio Codecs
Codecs for Encoding/Decoding Music Formats
Rockbox has previously only worked on music players with codecs provided by hardware. Now we hope to get it running on players where all codec work is done in software. On these players, Rockbox will aim to support as many file formats as possible, but initial development will focus on playback of MP3 files and recording to WAV files.
This page describes various audio codecs and provides links to resources that would be useful to a developer wanting to add support for that format to Rockbox.
NOTE: We can start working on this before we can run code on the target - see RockboxAudioAPIProposal.
Overview of Audio CODECs
The basic format of an audio file on a computer is a Wave (.WAV) file. This contains uncompressed PCM audio, so a 4-minute song at CD quality is about 40MB in size. Audio CODECs are programs that reduce this file size, and they fall into two main categories: "lossy" and "lossless".
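The 40MB figure can be checked with a little arithmetic. The sketch below is only an illustration: the function name is made up here, and "CD quality" is assumed to mean 44100 Hz, 2 channels, 16 bits per sample.

```c
#include <stdint.h>

/* Size in bytes of raw PCM audio data (the WAV header is not counted).
 * Hypothetical helper for illustration; not part of Rockbox. */
uint64_t pcm_size_bytes(uint32_t sample_rate_hz, uint32_t channels,
                        uint32_t bits_per_sample, uint32_t seconds)
{
    uint64_t bytes_per_sample = bits_per_sample / 8;
    return (uint64_t)sample_rate_hz * channels * bytes_per_sample * seconds;
}
```

Calling `pcm_size_bytes(44100, 2, 16, 240)` gives 42336000 bytes, which is a little over 40MB when a megabyte is counted as 1048576 bytes.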
The Hydrogenaudio wiki contains more information about the codecs discussed here.
Lossy CODECs
A "lossy" codec (e.g. MP3, OGG Vorbis, AAC) uses knowledge of human hearing to discard as much of the original audio signal as possible, whilst keeping the result sounding as close as possible to the original. These codecs typically achieve a file size of 10%-20% of the original.
Format | Decoder(s) | Encoder(s) |
MP3 | MAD - Helix. From what I can tell from the website, the Helix decoder is MP3 only (i.e. no layer-I or layer-II support), is written in C++, and is licensed under the Real Networks Public Source License (RPSL). For those reasons, and the fact that MAD is tried and tested, I think we should stick with MAD. -- DaveChapman. The fixed-point Helix MP3 decoder is not written in C++, it's written in C. RPSL is an OSI license and GPL compatible. It is tried and tested - Motorola use it in their phones. -- AlastairS. The RPSL is most definitely not GPL compatible. They do list it as a compatible license, but that more or less just means that the GPL is RPSL compatible, rather than the other way round. Which is of course a huge joke. The license even has a note to that effect. -- JonasHaeggqvist. Another option is Stephane TAVENARD's MPEGDEC library as ported to the Coldfire here. I have set this up to work with Rockbox, but the gains weren't as great as we hoped. I have no further time/desire to work on it, but if anyone feels like picking it up, it's available here. -- ThomJohansen | Moved to EncoderDiscussionMP3 |
OGG/Vorbis | Vorbis is a free format and is supported by the original iRiver firmware. The code is BSD licensed, so it should be fairly straightforward to put into the firmware. Tremor is their integer implementation of the decoder; more information is available on xiph.org. A snapshot of the Subversion repository can be downloaded here. There's a lowmem branch of Tremor which may be good on devices with low memory. See this mailing list post for details. | Also written and freely available under the BSD license, but not in integer form. |
MPC | libmusepack: Portable Musepack decoder library (including fixed point mode) by Peter Pawlowski, famous for being the foobar2000 developer (AKA "zZzZzZz" on http://www.hydrogenaudio.org ), Kuniklo and others. The library is under the Modified BSD license; the C sources of the library are here. For further information, see the dedicated HA thread or the musepack.net forum. You can find some other details on HA (HERE). #MPC on irc.musepack.net is the place for "support". "you will find that there are problems with seeking" | Open Source (LGPLd), but floating point only |
AAC (and HE-AAC???) | FAAD2 (Free Advanced Audio Decoder version 2, with HE-AAC support) from http://www.audiocoding.com comes under GPL (unlike the outdated FAAD version, which is under LGPL and doesn't allow HE-AAC decoding). FAAD(2) decodes both MPEG-2 and MPEG-4 AAC. NOTE: There is currently some controversy with an apparent GPL-incompatible change to the FAAD2 license - see this thread at HA for details (especially comment #35). RealNetworks released an open source fixed point HE-AAC decoder. Note: the license doesn't appear to be GPL compatible - any opinions? DanHollis: The Podzilla guys seem to think it's GPL compatible. More details HERE. RichardOBrien: The RCSL/RPSL (RealNetworks Public Source License) under which the AAC decoder was released, at Helix Community, is compatible with the GPL but not the other way around. PatrickSchuetz: I agree with RichardOBrien, for more details: HERE. For MP4 parsing, FAAD's mp4ff library should be used. Another tricky issue is AAC gapless playback - a discussion of the subject is here. | FAAC (Free Advanced Audio Coder) from http://www.audiocoding.com is under "Lesser GPL". FAAC doesn't provide HE-AAC encoding, but it also encodes both MPEG-2 and MPEG-4 AAC. Note that FAAC is also the name of the whole sourceforge project (including FAAC, FAAD and FAAD2), so don't confuse the two (more details HERE). No integer version available. |
WMA | FFMPEG contains a wma decoder. The xmms-wma plugin contains a stripped-down version of ffmpeg and would be a good place to start for an iRiver implementation. Problem: This is a floating point implementation and will not run in real-time on the iRiver. I tested it on the Hauppauge MediaMVP (running Linux) and it only ran in about 20% of realtime. -- DaveChapman | No Open Source encoder, integer or not |
A/52 (aka AC3) | liba52 is a GPL'ed implementation with an integer-only mode that would run without problems on the iRiver's hardware. AC3 is the most common audio format for DVDs, so support for this format would allow you to rip the audio from a DVD and play it directly on your iRiver without re-encoding. An obvious next step (if technically possible) would be "AC3 pass-thru" via the optical digital output to a standalone AC3 surround decoder. | ffmpeg contains a GPL'd ac3 encoder (ac3enc.c). Quality is very, very suboptimal - RobertoAmorim |
Speex | Speex is a codec geared towards speech compression. A fixed point version is currently being developed. | ??? |
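The table above repeatedly distinguishes integer (fixed point) decoders from floating point ones. As a rough illustration of what fixed point arithmetic looks like, here is a minimal sketch of a multiply in Q15 format (a 16-bit signed value treated as a fraction in the range -1 to just under 1). The Q15 choice and the names `q15_t` and `q15_mul` are assumptions made for this example; they are not taken from MAD, Tremor, or any other library listed above.

```c
#include <stdint.h>

/* Q15: value = raw / 32768. Hypothetical type for illustration only. */
typedef int16_t q15_t;

/* Multiply two Q15 fractions using integer arithmetic only. */
q15_t q15_mul(q15_t a, q15_t b)
{
    /* 16x16 -> 32 bit product is in Q30 format. */
    int32_t product = (int32_t)a * (int32_t)b;

    /* Add half an LSB for rounding, then shift back down to Q15.
     * Note: right-shifting a negative value is implementation-defined
     * in C; common compilers perform an arithmetic shift. */
    product = (product + (1 << 14)) >> 15;

    /* -1.0 * -1.0 would give +1.0, which does not fit: saturate. */
    if (product > INT16_MAX)
        product = INT16_MAX;

    return (q15_t)product;
}
```

For example, `q15_mul(16384, 16384)` multiplies 0.5 by 0.5 and returns 8192, which represents 0.25.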
Lossless CODECs
A "lossless" codec (e.g. FLAC) performs the same function as "winzip" - i.e. it compresses an audio file without discarding any of the information. These codecs typically achieve a filesize of 50%-60% of the original filesize, but the audio playback will be bit-for-bit identical to the original file.
Format | Decoder(s) | Encoder(s) |
WAVE (.wav) | ".wav" is the de-facto container format for storing uncompressed PCM audio, but may contain compressed data (most common formats are pcm, adpcm, alaw, mulaw, and dvi_adpcm). Obviously, no encoding or decoding needs to be done (but byte-swapping may be needed - WAV files are little-endian), but we do need to parse the headers and extract the data from the file - so a "codec" needs to be written. Details, official specs, sample WAV files. A patch to support formats other than PCM is maintained by FredericDevernay. | As decoder. |
FLAC | FLAC - integer only | FLAC According to this topic and the changelog, FLAC 1.1.2 has a compiler define (FLAC__INTEGER_ONLY_LIBRARY) that builds the whole library (decoder, encoder, etc.) with integer only. The remaining problem is the usage of 64-bit ints in the code. But it appears that someone is working on that. |
Shorten (.shn) | Shorten source is available from this website, but the license isn't GPL compatible and including it would be a major hassle. See discussion in the irc logs from Feb 13th 2005. FFmpeg has a LGPL decoder. | To be completed |
Wavpack (.wv) | Licensed under the Modified BSD license | Does both lossy and lossless encoding. |
Monkey's Audio (.ape) | Monkey's Audio 3.99 SDK. For further information look at the Monkey's Audio developer page. jmac, a Java implementation of Monkey's Audio, is also an option (under LGPL). Problem: The Monkey's Audio license is not open source compatible, since you are not allowed to use it without express written permission from the author. Discussions have been held about relicensing under LGPL, but there has been no progress since May '04. -- BjornStenberg. The codec is heavily x86-centric with lots of x86 assembly to speed up parts of the code - particularly a neural network. Unless it's very heavily optimized for 68K, it won't run real-time. And some compression modes (Extra high, Insane) probably won't run no matter how much you optimize it - RobertoAmorim | Monkey's Audio 3.99 SDK. jmac, a Java implementation of Monkey's Audio, is also an option (under LGPL) |
ALAC | Apple Lossless Audio Codec. http://craz.net/programs/itunes/alac.html | missing |
TTA | True Audio: http://tta.sf.net/ | GPL |
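The WAVE row above notes that WAV files are little-endian and that the headers need to be parsed. A minimal sketch of the byte-order part follows: reading fields one byte at a time gives the right value whatever the endianness of the target CPU. The helper names are hypothetical; the only format facts relied on are that a WAV file starts with "RIFF", a 32-bit little-endian size, then "WAVE".

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 16-bit little-endian value from a byte buffer. */
uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a 32-bit little-endian value from a byte buffer. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return nonzero if the buffer begins with a RIFF/WAVE signature. */
int is_riff_wave(const uint8_t *buf, size_t len)
{
    return len >= 12 &&
           memcmp(buf, "RIFF", 4) == 0 &&
           memcmp(buf + 8, "WAVE", 4) == 0;
}
```

A real parser would go on to walk the chunks that follow the signature to find the format and data chunks; that part is left out here.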
Other CODECs
There are other audio "CODECs" that don't fit into the "lossy" or "lossless" categories above. These are different because the source for the files was not a .WAV file, but rather they are formats used by music composers to store computer-generated music.
Format | Decoder(s) | Encoder(s) |
SID | The sid format is music from Commodore 64 games and other productions. I don't know much about the format, but a good starting point would probably be sidplay2, which includes a standalone library. I don't know whether or not this is floating point. A thing to note is that sid files have no notion of playing time. They are simply programs that go on forever, either with music that loops or silence. Traditionally this has been fixed by looking up playtimes in a database of file-md5 hashes which includes more or less every song available on the net. Furthermore, each file may contain many subtunes. DanHollis: sidplay2 appears to be completely integerized. MartinArver: Unfortunately, libsidplay, which is the lib for sidplay2, is written in C++. I have been looking at an older version of libsidplay (1.36.59), which has fixed-point math. But, as it is written in C++, it seems we have to build the toolchain with C++ enabled for this to compile. | Encoding not possible. |
SPC | The SPC format is music from Super Nintendo (Super Famicom in Japan) games and other productions. There are a multitude of players for the SPC format, most of them listed here. As with most sound emulator formats the sound is not absolutely accurate, but the SNESamp Winamp plugin is generally regarded as the one which represents the sound best. I'm not sure if SNESamp is floating point (ChrisRobinson: or if the nature of the Winamp plugin is indeed useful for converting - I'm no programmer), but if not there are multiple other players. SNESamp actually uses the SNESapu emulator for SPC decoding. It is indeed the most accurate SPC700 APU emulator publicly available. Unfortunately, all emulation code is written in x86 assembly (NASM) - RobertoAmorim. Unlike the sid format, SPCs are single songs, which have ID666 tag support, and every SPC file is 64kb in size. However, the more popular RSN format acts more like a sid file and contains all songs from a game (RSNs are simply RAR files which contain SPCs [which have a 90% average compression rate] renamed to RSN). Some players support RSN, others I believe do not. | Encoding not possible. |
Tracker formats | This is actually a whole range of more or less similar sequencer formats. There are a few opensource libraries available, which all support a lot of formats. Mikmod is for some reason often used, but I've been told that it lacks support for many features in some formats. This is a C library. Modplug (the project also releases libmodplug) should have better support and it seems that it has very limited use of floats. However, libmodplug is in C++. People talking about Mikmod vs. Modplug. Another option might be XMP which is C. There's also DUMB which is said to be an accurate module player library (also C). Any thoughts on which has the best support? Modplug is probably not feasible. | Encoding not possible. |
MIDI | I have written a simple MIDI player for Linux in C++ and am now rewriting it in C in the hope of making it work with Rockbox. Currently the C port is able to allocate memory for the file, decode it, sequence events, and interpret them enough to play the file using the sound card and a very simple sinewave-based synthesizer I threw together. I am hoping to find an adlib emulation engine that I can import into this thing, and possibly a wavetable engine if I can find good patches. It plays sound and at this state, it may actually work on the iRiver if someone ports the sound output routine in pctest.c to write to the iRiver DSP. I guess the MIDI codec would have to be more of a plug-in, as it loads the entire file at once and then plays it from memory... You may want to look for the Gravis UltraSound patches; they are floating around and used by software MIDI players such as TiMidity++. I have looked at TiMidity++ as well as the music engine used by ScummVM... Does anyone know of a good description of the Gravis UltraSound patch format? Those patches store a good deal of information, such as waveform looping, envelope, etc., but I cannot find a guide that explains the fields in the file. Any ideas? There seems to be downloadable Gravis patch info here. All right.. playback, looping, interpolation (No more ghetto lowpass filter!), drums, panning, pitch wheel and all that work fine. I have just added envelope support. It works but could probably use some more exhaustive testing. I don't know how well envelope stuff will work on the target, given the amount of extra work it puts on the processor compared with the difference it makes to the output. I guess at this point the code needs to be built for and tested on the target, but I don't really know what kind of functions it uses for file I/O, etc. Maybe someone can help me with that.. someone who actually has an iRiver, etc. Updated synth sample here. This plugin needs a separate soundset to work. This is available here. Extract its contents into the /.rockbox directory. Warning: file is around 22MB in size. Update: the codec runs on target at around 78% realtime. This is only C optimization. I have paused working on this because to optimize it further, I need an actual iRiver, and unfortunately I do not have one. I do plan on getting an H3X0 when Rockbox starts supporting it, and then I will try to work on this codec some more. Anyone else with a device is welcome to work on this thing.. | Encoding not possible. |
Current status
Last updated: 18 August 2005
You might want to check IriverTesting for a detailed view on codecs on Irivers.
Codec | Status on iRiver | Realtime | Plays on iRiver | Seeking |
Lossy CODECs |
Mpeg-audio | CVS code runs fairly efficiently, but there's still room for more optimisation. | | | |
Ogg/vorbis | Code with some optimizations in CVS. | | | |
MPC | Code with some optimizations in CVS. Playback in Rockbox works, but is not yet real-time | | See status | |
A/52 (AC3) | Code with some optimizations in CVS. | | | |
Lossless CODECs |
WAV | Works. | | | |
FLAC | Code with some optimizations in CVS. Very close to real-time, but not quite there yet. | | | |
ALAC | Working decoder implemented, not yet in CVS. Needs optimisation. | | | |
Wavpack | CVS code runs very efficiently, both encoding and decoding. | | | |
Other CODECs |
MIDI | Code with some optimizations in CVS. | | | |
MOD | DUMB checked into CVS - not yet functional. | | | |
Realtime means that the codec is able to decode a file as fast as it needs to be played (i.e. a one minute file is decoded in one minute). Codecs should be a good deal faster than this, though, to allow for buffering, crossfading etc.
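Figures such as "20% of realtime" or "78% realtime" elsewhere on this page can be read as the ratio of audio duration to decode time. A minimal sketch of that calculation, assuming both times are measured in milliseconds (the function name is hypothetical and not part of Rockbox):

```c
#include <stdint.h>

/* Decode speed as a percentage of realtime.
 * 100 means the codec decodes exactly as fast as the audio plays;
 * below 100 means it cannot keep up. Returns 0 if decode_ms is 0,
 * since no meaningful ratio can be computed. */
uint32_t realtime_percent(uint32_t audio_ms, uint32_t decode_ms)
{
    if (decode_ms == 0)
        return 0;
    return (uint32_t)(((uint64_t)audio_ms * 100) / decode_ms);
}
```

A one minute file that takes one minute to decode gives `realtime_percent(60000, 60000)`, which is 100; the same file taking five minutes to decode gives 20.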
Revision r1.100 - 27 Aug 2005 - 21:48 GMT - DaveChapman
Copyright © 1999-2005 by the contributing authors.